Discriminative Cross-View Binary Representation Learning
Abstract
Learning compact representations is vital and challenging for large-scale multimedia data. Cross-view/cross-modal hashing for effective binary representation learning has received significant attention as the availability of multimedia content grows exponentially. Most existing cross-view hashing algorithms emphasize the similarities within individual views, which are then connected via cross-view similarities. In this work, we focus on exploiting the discriminative information from different views and propose an end-to-end method for learning semantic-preserving and discriminative binary representations, dubbed Discriminative Cross-View Hashing (DCVH). The aim is a multitasking binary representation that serves various tasks, including cross-view retrieval, image-to-image retrieval, and image annotation/tagging. The proposed DCVH has the following key components. First, it uses convolutional neural network (CNN) based nonlinear hashing functions and multilabel classification for both images and texts simultaneously. By using Direct Binary Embedding (DBE) layers, these hashing functions achieve effective continuous relaxation during training without an explicit quantization loss. Second, we propose an effective view alignment via Hamming distance minimization, which is efficiently accomplished by the bit-wise XOR operation. Extensive experiments on two image-text benchmark datasets demonstrate that DCVH outperforms state-of-the-art cross-view hashing algorithms as well as single-view image hashing algorithms. In addition, DCVH provides competitive performance for image annotation/tagging.
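The abstract notes that Hamming distance between binary codes reduces to a bit-wise XOR. The sketch below illustrates only that generic identity at retrieval time; it is not the DCVH training objective or the authors' code. The function names (`pack_bits`, `hamming_xor`), the 10-bit code length, and the toy codes are all assumptions made for illustration.

```python
import numpy as np

# Lookup table: number of set bits in every possible byte value.
_POPCOUNT = np.array([bin(i).count("1") for i in range(256)], dtype=np.uint8)


def pack_bits(codes):
    """Pack an (n, k) array of 0/1 bits into (n, ceil(k / 8)) bytes.

    np.packbits pads the last byte with zeros, so padding never
    contributes to the XOR of two codes of the same length.
    """
    return np.packbits(np.asarray(codes, dtype=np.uint8), axis=1)


def hamming_xor(packed_a, packed_b):
    """Pairwise Hamming distances between two sets of packed codes.

    XOR marks the differing bits; counting the set bits in the
    result gives the Hamming distance.
    """
    xor = packed_a[:, None, :] ^ packed_b[None, :, :]
    return _POPCOUNT[xor].sum(axis=2, dtype=np.int64)


# Hypothetical 10-bit codes for two images and two texts.
image_codes = np.array([[1, 0, 1, 1, 0, 0, 1, 0, 1, 1],
                        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]])
text_codes = np.array([[1, 0, 1, 1, 0, 0, 1, 0, 1, 1],
                       [0, 1, 1, 1, 0, 0, 1, 0, 1, 0]])

dist = hamming_xor(pack_bits(image_codes), pack_bits(text_codes))
# Rank texts for each image query, nearest code first.
ranking = np.argsort(dist, axis=1, kind="stable")
```

Packing eight bits per byte keeps the codes compact, which is one reason XOR-based comparison is attractive for large-scale retrieval. A production system would more likely use a hardware popcount than a lookup table.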
Related resources
Learning Cross-View Binary Identities for Fast Person Re-Identification
In this paper, we propose to learn cross-view binary identities (CBI) for fast person re-identification. To achieve this, two sets of discriminative hash functions for two different views are learned by simultaneously minimising their distance in the Hamming space, and maximising the cross-covariance and margin. Thus, similar binary codes can be found for images of a same person captured at dif...
CM-GANs: Cross-modal Generative Adversarial Networks for Common Representation Learning
It is known that the inconsistent distribution and representation of different modalities, such as image and text, cause the heterogeneity gap, which makes it very challenging to correlate such heterogeneous data. Recently, generative adversarial networks (GANs) have been proposed and have shown a strong ability to model data distributions and learn discriminative representations, and most of ...
Cross-Domain Ground-Based Cloud Classification Based on Transfer of Local Features and Discriminative Metric Learning
Cross-domain ground-based cloud classification is a challenging issue as the appearance of cloud images from different cloud databases possesses extreme variations. Two fundamental problems which are essential for cross-domain ground-based cloud classification are feature representation and similarity measurement. In this paper, we propose an effective feature representation called transfer of ...
Learning Discriminative LBP-Histogram Bins for Facial Expression Recognition
Local Binary Patterns (LBP) have been widely used for facial image analysis in recent years. In existing work, the LBP histograms are extracted from local facial regions and used as a whole for the regional description. However, not all bins in the LBP histogram are necessarily useful for facial representation. In this paper, we propose to learn discriminative LBP-Histogram (LBPH) bins for...
Domain Transfer Learning for Object and Action Recognition
Title of dissertation: Domain Transfer Learning for Object and Action Recognition. Jingjing Zheng, Doctor of Philosophy, 2015. Dissertation directed by: Professor Rama Chellappa, Department of Electrical and Computer Engineering. Visual recognition has always been a fundamental problem in computer vision. Its task is to learn visual categories using labeled training data and then identify unlabeled...
Publication date: 2018